This review focuses on the important contributions that macromolecular
crystallography has made over the past 12 years to elucidating structures
and mechanisms of the essential proteases of coronaviruses, the main
protease (Mpro) and the papain-like protease (PLpro). The role of X-ray
crystallography in structure-assisted drug discovery against these targets
is discussed. Aspects dealt with in this review include the emergence of
the SARS coronavirus in 2002-2003 and of the MERS coronavirus 10 years
later and the origins of these viruses. The crystal structure of the free
SARS coronavirus Mpro and its dependence on pH are discussed, as are
efforts to design inhibitors on the basis of these structures. The mechanism
of maturation of the enzyme from the viral polyprotein is still a matter of
debate. The crystal structure of the SARS coronavirus PLpro and its complex
with ubiquitin are also discussed, as is its orthologue from MERS
coronavirus. Efforts at predictive structure-based inhibitor development
for bat coronavirus Mpros to increase the preparedness against zoonotic
transmission to man are described as well. The paper closes with a brief
discussion of structure-based discovery of antivirals in an academic setting.


(Received 6 May 2014, revised 7 July 2014, 
accepted 15 July 2014) 


doi: 10.1111/febs.12936


SARS - a decade on 

Eleven years ago, the world was shocked by the outbreak of the severe
acute respiratory syndrome (SARS), which spread from its origin in the
Southern Chinese province of Guangdong to Hong Kong and from there to
about 30 countries, of which Vietnam, Singapore, Taiwan and Canada
(Toronto) were most affected. The virus also travelled from Hong Kong to
Beijing, where more than 3000 SARS cases were recorded in that city
alone. Altogether, about 8000 cases were registered worldwide, of which
about 10% were fatal. SARS was characterized by an atypical, severe
pneumonia (for recent reviews commemorating the 2003 SARS outbreak and
discussing the lessons learned, see [1-4]).

On 24 March 2003 a new coronavirus, appropriately 
named SARS coronavirus (SARS-CoV), was described 
as the etiological agent causing the epidemic [5-8]. 
This virus was rapidly classified as an outlier of what 
were called group 2 coronaviruses at the time [9];
according to the new nomenclature introduced a few 
years later (see for example [10]), SARS-CoV belongs 
to clade b of the genus Betacoronavirus. 

Newly discovered and newly emerging 
human coronaviruses 

Following the SARS epidemic, two new human coronaviruses were discovered
as a result of intensified research efforts targeting this previously
neglected virus family. In 2004 human coronavirus NL63, a member of the
genus Alphacoronavirus, was described [11,12], followed by the discovery
of HCoV HKU1, a clade-a betacoronavirus, a year later [13]. These viruses
are widespread but do not cause severe disease in the majority of people
infected by them [14]. In September 2012 another novel human
coronavirus, Middle East respiratory syndrome (MERS) coronavirus, was
described [15]. It had been detected in patients from Saudi Arabia and
other countries on the Arabian Peninsula or in people who had a history
of travel to the Middle East. The earliest cluster of MERS cases detected
so far was in Jordan in April 2012, as shown retrospectively on the basis
of patient samples. Symptoms of MERS include severe respiratory disease
and often renal failure; as of 4 July 2014, 827 laboratory-confirmed cases
have been recorded, with 287 deaths (http://www.who.int). The
case-fatality ratio of MERS is thus alarmingly high, at about 35%.
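
As a quick check of the figures quoted above, the case-fatality ratio follows directly from the WHO counts. The short Python sketch below is illustrative only; the function name is an assumption made for this example, and the counts are those given in the text.

```python
def case_fatality_ratio(deaths: int, cases: int) -> float:
    """Return deaths as a percentage of laboratory-confirmed cases."""
    if cases <= 0:
        raise ValueError("cases must be a positive number")
    return 100.0 * deaths / cases


# MERS counts quoted in the text (WHO, as of 4 July 2014).
mers_cfr = case_fatality_ratio(deaths=287, cases=827)
print(f"MERS case-fatality ratio: {mers_cfr:.1f}%")  # prints 34.7%
```

This is considerably higher than the roughly 10% fatality quoted for SARS earlier in this review.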

Where did the SARS and MERS 
coronaviruses come from? 

In the case of SARS-CoV, wild animals such as palm civets, sold as a
delicacy on Chinese ‘wet markets’, were initially identified as the
immediate source of the virus [16], but from 2005 insectivorous
Rhinolophid bats came into focus as the original reservoir, from which
the virus was possibly transmitted to civets and other market species and
from them to humans [17,18] (see [19] for a recent review on bat
coronaviruses). However, it took until 2013 to discover a bat coronavirus
that is more than 95% identical to SARS-CoV and uses the same receptor on
the surface of host cells, the angiotensin-converting enzyme 2 (ACE2)
[20]. In the case of MERS coronavirus (MERS-CoV), bats were again
suspected to be the reservoir as a few coronaviruses with high sequence
similarity to MERS-CoV were discovered in African and European bats
[21,22], but in recent months the picture has changed somewhat and
dromedary camels are now the main suspected reservoir from which the
zoonotic transmission into the human population originates [23,24].

After the SARS epidemic was over, many scientists and policy-makers,
including even many virologists, believed that the event was unique and
that the chances of repetition were extremely low. Thus, it must be said
that more effort could (and should) have been made to develop
small-molecule compounds with anti-coronavirus activity; this was
hampered, however, by a sharp decline in funding of coronavirus research
in many countries after 2005-2006, and by a lack of support from the
scientific community. As a consequence, not all lessons that the SARS
outbreak taught us were taken seriously (discussed in [1]). But the
recent - and still continuing - emergence of MERS-CoV has illustrated
that such an event can happen anywhere, at any time, given the large
number of coronavirus species in Nature, of which we probably only know
a fraction so far. Coronaviruses feature the largest RNA genome (about
30 kb; Fig. 1) known, and this genome is extremely flexible in terms of
incorporation and deletion of gene products in response to evolutionary
pressure such as the need to adapt to a new host. The coronavirus genome
is also prone to recombination events, thereby adding further to its
flexibility.

The coronavirus main protease (Mpro)

In this review, I will examine whether and how macromolecular
crystallography has contributed to the discovery of antivirals targeting
proteins from the new viruses, SARS-CoV and MERS-CoV. In doing so, I will
focus on the main antiviral drug targets, the coronavirus main protease
(Mpro, also called the 3C-like protease, 3CLpro) and the papain-like
protease (PLpro). Other enzymes of the coronaviruses, such as the helicase
and the RNA-dependent RNA polymerase, are also targets for antiviral drug
discovery, but such efforts are limited so far because of the lack of
crystal structures for these enzymes (see [25] for a recent review). The
coronaviral proteases Mpro and PLpro are responsible for processing the
huge polyproteins pp1a and pp1ab, which are encoded by open reading
frame 1 (ORF1) of the coronavirus genome, into mature non-structural
proteins (Nsps), most of which form part of the coronaviral
replication/transcription complex (Fig. 1; for information on other
SARS-CoV protein structures see [1,25-27]).

Fig. 1. Schematic presentation of the genome of the SARS coronavirus.
Occupying two-thirds of the genome from the 5' end, open reading frame 1
(ORF1) encodes two large polyproteins, pp1a and, through ribosomal
frameshifting during translation, pp1ab. These polyproteins are processed
into mature Nsps by the two proteases discussed here (indicated in
yellow). The main protease (Mpro, also called 3C-like protease, 3CLpro)
is Nsp5, whereas the papain-like protease (PLpro) is a part of Nsp3. The
PLpro performs three cleavage reactions (red arrows) to release Nsp1,
Nsp2 and Nsp3 (red), whereas the Mpro cleaves the polyprotein at 11 sites
(cyan arrows) to release Nsp4-Nsp16 (cyan). The 3'-terminal third of the
genome codes for structural and accessory proteins.

The Mpro is encoded by ORF1 as non-structural protein 5 (Nsp5) and
cleaves the polyproteins at no fewer than 11 sites (Fig. 1). It is
flanked by the proteins Nsp4 and Nsp6 which, along with parts of Nsp3,
anchor the replication/transcription complex to double-membrane vesicles
that are derived from the endoplasmic reticulum membrane during the
infection [28]. Substrate cleavage by the Mpro follows the general
pattern (small)-X-(L/F/M)-Q↓(G/A/S)-X (X = any amino acid; ↓ = cleavage
site); in particular, the glutamine (Q) residue in the P1 position of the
substrate is an absolute requirement. As no host-cell proteases are known
with this specificity, the prospects for developing anti-coronavirals
with few side-effects are good.
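
The cleavage pattern described above can be written as a simple sequence motif. The Python sketch below scans a peptide for candidate Mpro sites; it is a minimal illustration, not a validated predictor. The residue set used for "small" and the example peptide are assumptions made for this example, as the text does not specify them.

```python
import re
from typing import List

# Assumption: the text only says "small" for the P4 position; this residue
# set is a placeholder chosen for illustration.
SMALL = "AGSTV"

# P4 (small) - P3 (any) - P2 (L/F/M) - P1 (Q) | P1' (G/A/S) - P2' (any).
# A zero-width lookahead is used so that overlapping candidates are found.
MPRO_SITE = re.compile(rf"(?=[{SMALL}].[LFM]Q[GAS].)")


def mpro_cut_positions(seq: str) -> List[int]:
    """Return the indices at which the chain would be cut (just after P1 Gln)."""
    return [m.start() + 4 for m in MPRO_SITE.finditer(seq)]


def digest(seq: str, cuts: List[int]) -> List[str]:
    """Split seq into fragments at the given cut positions."""
    bounds = [0, *cuts, len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Illustrative peptide, assumed for this example.
peptide = "TSAVLQSGFRK"
cuts = mpro_cut_positions(peptide)
fragments = digest(peptide, cuts)
print(cuts, fragments)  # [6] ['TSAVLQ', 'SGFRK']
```

Replacing the P1 glutamine by any other residue removes the match, which mirrors the absolute requirement for Gln in this position.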

Crystallographic studies on
coronavirus Mpro prior to and during
the SARS outbreak

My group had started working on the coronavirus Mpro around 1999. At
that time, not a single crystal structure of a coronavirus protein had
been determined. We first elucidated the crystal structure of the Mpro of
transmissible gastroenteritis virus (TGEV), a porcine coronavirus that is
fatal for young piglets. Published in 2002 [29], the structure revealed
that the Mpro is a dimer (cf. Fig. 2) in which the N-terminus (the
‘N-finger’) of one monomer helps shape the S1 substrate-specificity
pocket and the oxyanion hole of the other monomer; hence, dimerization is
a prerequisite for catalytic activity. It also revealed the presence of
an α-helical domain (domain III) in addition to domains I and II, which
together feature a chymotrypsin-like fold and harbor the catalytic
Cys...His dyad between them. Subsequently, we synthesized a
chloromethylketone inhibitor and cocrystallized it with the TGEV Mpro in
order to visualize the substrate-binding site in detail [30]. At the same
time, we also determined the structure of the Mpro of human coronavirus
229E (HCoV 229E). When SARS-CoV was identified and sequenced in the
spring of 2003, we built the first homology model of the SARS-CoV Mpro on
the basis of the structure of the enzyme from HCoV 229E [30]. We further
suggested, on the basis of the binding mode of our chloromethylketone
inhibitor, that the Michael acceptor compound AG7088 (rupintrivir), which
was being developed by Pfizer as an inhibitor of the 3C protease of human
rhinovirus [31], should be a good starting point for anti-SARS drug
design [30,32]. Later, this compound turned out not to have particularly
high activity against SARS-CoV in cell culture, but derivatives of this
Michael acceptor lead exhibited good anti-coronaviral activity in vitro
and ex vivo [33-35]. Towards the end of the SARS outbreak in Beijing (in
June 2003), the crystal structure of the SARS-CoV main protease itself
was determined through a collaboration between the group of Zihe Rao in
Beijing, who had recombinantly produced and crystallized the enzyme, and
my group, both as the free protease (Fig. 2) and in complex with the
chloromethylketone inhibitor that we had already used for the TGEV Mpro
[36].



Fig. 2. Stereo presentation of the structure of the SARS-CoV Mpro dimer [36]. The catalytic dyads of each subunit (Cys145...His41) are
indicated, as are the N- and C-termini. Note that the N-terminus of the cyan polypeptide chain is located close to the substrate-binding site
of the purple subunit.
Influence of pH on the Mpro structure

The first structure of the free SARS-CoV Mpro [36] was determined from
crystals that had been grown at acidic pH (around 6.0); in this
structure, one monomer of the Mpro dimer was in the active state and the
other one in a catalytically incompetent conformation in which the S1
specificity pocket and the oxyanion hole were collapsed. When the same
crystals were equilibrated in buffer of pH 7.4, both monomers were found
in the active conformation, whereas at pH 8.0 the substrate-binding site
was less well defined due to increasing flexibility of the amino acid
side-chains involved. This phenomenon was explained by molecular dynamics
simulations run with different protonation states for two key histidine
residues (His163 and His172) involved in shaping the S1 substrate-binding
site [37]. The pH-activity profile of the SARS-CoV Mpro was found to be
very probably determined by protonation of His163 (inactivation at acidic
pH) and deprotonation of His172 (inactivation at basic pH) [37]. The
observation of the catalytically incompetent form (with the S1 site and
the oxyanion hole collapsed) has occasionally been ascribed (e.g. [38])
to the presence of five additional residues at the N-terminus that
remained from the cloning procedure; the phenomenon has not been observed
with enzyme featuring authentic chain termini when crystallized in space
group C2 [39]. We have determined structures of the SARS-CoV Mpro with
authentic chain termini from crystals grown with other symmetries and did
observe the presence of both an active and an inactive monomer at low pH
[40] (Verschueren et al., unpublished). The existence of a less active
proform of the enzyme may allow control of the temporal order of
processing the individual polyprotein cleavage sites to release
intermediate and mature Nsps at the time in replication when they are
needed. Unfortunately, the pH at the site of action of the Mpro, at the
endoplasmic-reticulum-derived double-membrane vesicles [28], is not known.
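
A bell-shaped pH-activity profile of the kind described above is often approximated by a textbook two-pKa model, in which the enzyme is active only when one group (here taken to correspond to His163) is deprotonated and a second group (here His172) is protonated. The Python sketch below implements that generic model. It is not the molecular dynamics analysis of [37], and the two pKa values are placeholders chosen for illustration, not measured values.

```python
def active_fraction(ph: float, pka_acid: float, pka_base: float) -> float:
    """Fraction of enzyme in the active protonation state.

    Generic two-pKa model: the group with pka_acid must be deprotonated
    and the group with pka_base must be protonated for activity.
    """
    return 1.0 / (1.0 + 10 ** (pka_acid - ph) + 10 ** (ph - pka_base))


# Placeholder pKa values (assumptions for illustration only).
PKA_HIS163 = 6.0
PKA_HIS172 = 8.5

# In this model the optimum lies midway between the two pKa values.
ph_optimum = (PKA_HIS163 + PKA_HIS172) / 2

for ph in (6.0, ph_optimum, 8.5):
    fraction = active_fraction(ph, PKA_HIS163, PKA_HIS172)
    print(f"pH {ph:.2f}: active fraction {fraction:.2f}")
```

In this model the activity falls off on the acidic side as the first group becomes protonated and on the basic side as the second group loses its proton, which is the qualitative behaviour described in the text.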

How does Mpro maturation work?

Before auto-activation and liberation from the viral polyproteins pp1a
and pp1ab, the Mpro (Nsp5) is an integral part of these polyproteins
(Fig. 1). The mechanism of auto-activation of the enzyme is not well
understood (see [38] for a review). Several studies have used constructs
carrying fluorescent proteins at both termini of the SARS-CoV Mpro and
connected to the enzyme by peptide sequences containing Mpro cleavage
sites [41,42]. Such polyprotein models are usually monomeric, but dimer
formation upon addition of substrates has been observed [41]. We have
found that upon mutation of three residues (Arg4, Glu290 and Arg298)
involved in the monomer-monomer interface of the mature protease, the
resulting monomeric enzyme can still perform N-terminal autocleavage,
while dimerization and trans-cleavage activity are completely inhibited
by the Glu290Arg and Arg298Glu mutations and partly so by the Arg4Glu
mutation. Furthermore, the mature Glu290Arg mutant can resume N-terminal
autocleavage activity when mixed with an inactive Mpro species, whereas
its trans-cleavage activity remains absent. Therefore the N-terminal
autoprocessing of the Mpro appears to require only two ‘immature’
monomers approaching one another to form an ‘intermediate’ dimer
structure and does not depend on the active dimer conformation existing
in the mature protease [43]. The octameric form of the immature Mpro,
which features a three-dimensional swap of the helical domain III of the
enzyme [44], may play a role in the auto-activation process.

Discovery and design of Mpro
inhibitors

A large number of crystal structures have been published of inhibitor
complexes of the SARS-CoV Mpro, of which only a few can be mentioned
here. Many types of chemical warheads have been used to achieve covalent
binding of peptidic or peptidomimetic inhibitors to the active-site
cysteine of the Mpro, including the halomethylketones [30,36,45] and
Michael acceptor compounds (α,β-unsaturated esters) [33-35] mentioned
above, aldehydes [46-49], α,β-epoxyketones [50-52], nitriles [53] and
phthalhydrazide ketones [54,55]. All of these compounds are
peptidomimetics carrying electrophilic warheads, and several also
efficiently inhibit SARS-CoV replication in cell culture. Some of the
inhibitors, such as the halomethylketones, are certainly too reactive to
be developed into drugs, as they are expected to exhibit considerable
side-effects. One might intuitively assume the same of aldehydes, but in
fact peptide aldehyde inhibitors of thrombin (such as efegatran) did not
show toxicity in clinical trials [56,57]. Also, it should be noted that
two hepatitis C virus NS3/NS4A protease inhibitors introduced into the
market in 2011, telaprevir and boceprevir, are peptidomimetics carrying
the α-ketoamide warhead [58]. Finally, rupintrivir (AG7088) is an example
of a Michael acceptor compound that was developed as an inhibitor of the
3C protease of human rhinovirus [31]. There is a trend away from
non-covalent binders of target serine or cysteine proteases and towards


FEBS Journal 281 (2014) 4085-4096 © 2014 FEBS 



R. Hilgenfeld 


Crystallography of coronavirus proteases 


covalent reversible or irreversible binders. Given the absolute
requirement of the coronavirus Mpro for glutamine in the P1 position of
the substrate, and the absence of human proteases with the same
specificity, there is a good chance of developing coronavirus protease
inhibitors carrying electrophilic warheads without having to expect too
many side-effects (see above). Figure 3 shows our broad-spectrum Michael
acceptor compound SG85, which we originally developed against the
enterovirus 3C protease [59], bound to the SARS-CoV Mpro, as revealed by
X-ray crystallography (Zhu et al., unpublished; PDB code 3TNT). In
agreement with the expectation outlined above, this compound shows no
sign of toxicity in Huh-T7 or Vero A cells (CC50 = 256 and 190 µM,
respectively [59]) or in mice (Leyssen, Neyts et al., unpublished), while
it exhibits an IC50 of around 2 µM both against the isolated SARS-CoV
Mpro and in a SARS-CoV replicon and of about 3.3 µM in SARS-CoV-infected
Vero B4 cells (Zhu, Kusov, Muth et al., unpublished).
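
The cytotoxicity and potency figures quoted above for SG85 can be combined into a rough selectivity estimate, the ratio of the cytotoxic concentration (CC50) to the inhibitory concentration in infected cells. The Python sketch below performs this arithmetic with the values from the text. Note the assumption involved: the CC50 values were measured in Huh-T7 and Vero A cells, whereas the antiviral value of 3.3 µM refers to Vero B4 cells, so the ratios are indicative only and are not values reported by the cited studies.

```python
def selectivity_index(cc50_um: float, ic50_um: float) -> float:
    """Ratio of cytotoxic to inhibitory concentration (both in µM)."""
    if ic50_um <= 0:
        raise ValueError("ic50_um must be positive")
    return cc50_um / ic50_um


# Values quoted in the text for SG85 (µM).
CC50_UM = {"Huh-T7": 256.0, "Vero A": 190.0}
IC50_INFECTED_CELLS_UM = 3.3  # SARS-CoV-infected Vero B4 cells

indices = {
    cell: selectivity_index(cc50, IC50_INFECTED_CELLS_UM)
    for cell, cc50 in CC50_UM.items()
}
for cell, index in indices.items():
    print(f"CC50 ({cell}) / IC50 = {index:.0f}")
```

Both ratios are well above 10, consistent with the statement that the compound shows no sign of toxicity at concentrations where it is active.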

In addition, a number of non-peptidic, reversible inhibitors of the main
protease have been discovered by virtual screening and/or docking on the
basis of the crystal structure; examples of such compounds are
cinanserin [60], arylboronic acids [61], isatin derivatives [62],
selected diarylsulfones [63] and a variety of others [63,64]. Other
non-peptidic inhibitors, such as benzotriazole esters [40,65,66] and
non-warheaded benzo[1,2,3]triazoles [67], were discovered by screening of
chemical libraries and subsequent optimization of the hits by medicinal
chemistry. Chloropyridyl esters have been derived from the benzotriazole
esters and found to have good antiviral activity in cell culture [68].


The SARS-CoV papain-like protease
(PLpro): functions in the viral
replication cycle and in antagonizing
innate immunity

The other protease encoded by the SARS-CoV genome, the papain-like
protease, is responsible for processing three cleavage sites in the
N-terminal part of the polyproteins, to produce mature Nsp1, Nsp2 and
Nsp3 (Fig. 1). The cleavage specificity of the PLpro corresponds to the
pattern (R/K)L(R/K)GG↓X. In addition, the enzyme is a deubiquitinase,
i.e. it removes (poly)ubiquitin units from proteins tagged with them
[69,70]. Ubiquitin carries the sequence LRLRGG at its C-terminus, in
perfect agreement with the coronavirus PLpro recognition motif. The
deubiquitinase activity of the enzyme interferes, in an as-yet unknown
way, with the phosphorylation and nuclear import of interferon-regulatory
factor 3 (IRF3) and thereby prevents the production of type-I
interferons by the infected host cell [71-73]. The SARS-CoV PLpro has
also been shown to have deISG15ylating activity [74], i.e. it removes
ISG15 units from target proteins labeled this way (ISG,
interferon-stimulated gene product). Finally, the SARS-CoV PLpro has been
demonstrated to interfere with the nuclear factor κB pathway, i.e. it is
an important weapon of the virus in its efforts to counteract the innate
immune response of the infected host cell [73].
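
The agreement between the PLpro recognition motif and the C-terminus of ubiquitin mentioned above can be checked with a few lines of Python. The sketch below encodes the pattern (R/K)L(R/K)GG as a regular expression; the function name is an assumption made for this example, and the check is a plain string match, not a model of enzyme recognition.

```python
import re
from typing import List

# (R/K)-L-(R/K)-G-G, with cleavage directly after the second glycine.
PLPRO_SITE = re.compile(r"[RK]L[RK]GG")


def plpro_cut_positions(seq: str) -> List[int]:
    """Return the indices directly after each Gly-Gly of a matching motif."""
    return [m.end() for m in PLPRO_SITE.finditer(seq)]


# C-terminal sequence of ubiquitin as quoted in the text.
UBIQUITIN_C_TERMINUS = "LRLRGG"
ub_cuts = plpro_cut_positions(UBIQUITIN_C_TERMINUS)
print(ub_cuts)  # [6]
```

The single cut position equals the length of the sequence, i.e. the motif ends exactly at the C-terminal Gly-Gly, which is where ubiquitin is attached to its target proteins.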

Fig. 3. Stereo illustration of the Michael acceptor compound SG85 (Cbz-(tBu-O-)Ser-Phe-GlnLactam-CH=CH-CO-O-Et; [59]) bound to the
substrate-binding site of the SARS-CoV Mpro (Zhu et al., unpublished; PDB code 3TNT). The inhibitor is shown in cyan (for carbon), blue (for
nitrogen) and red (for oxygen). Hydrogen bonds are indicated by dashed lines and water molecules by small red spheres. The P4-P1'
side-chains of the inhibitor are labeled. The P1 side-chain is buried in the S1 pocket and only the tip of its lactam moiety is visible in this
illustration.

Crystallographic studies on the SARS-CoV
PLpro and inhibitor discovery

The crystal structure of the SARS-CoV PLpro was reported by Ratia et al.
[75]. The enzyme consists of an N-terminal ubiquitin-like (Ubl) domain
and a catalytic core domain that features an open-right-hand fold, with
thumb, palm and fingers subdomains. At the tip of the fingers domain, a
structural zinc ion is found within a zinc-ribbon structure (Fig. 4). It
took a number of years to obtain a crystalline complex between the
SARS-CoV PLpro and ubiquitin; only very recently, Chou et al. [76]
published the structure of a complex between ubiquitin and a PLpro that
had the catalytic cysteine residue replaced by serine, and Ratia et al.
[77] reported the structure of the native SARS-CoV PLpro in complex with
ubiquitin aldehyde, where the C-terminal aldehyde group forms a covalent
bond with the catalytic Cys112 of the enzyme.

The use of peptidomimetic inhibitors to block the SARS-CoV PLpro faces
the difficulty that such inhibitors would very likely also inhibit
host-cell deubiquitinases, so that severe side-effects would have to be
expected. Therefore, the search for inhibitors of the SARS-CoV PLpro
focused on screening chemical libraries for non-peptidic, reversible
inhibitors of the enzyme. This way, Ratia et al. [78] and Ghosh et al.
[79,80] identified hit compounds that were further optimized to yield
inhibitors with submicromolar activities against the isolated enzyme and
low-micromolar activities in SARS-CoV-infected cell cultures (see also
[81]). The hit-to-lead optimization relied heavily on crystal structures
of complexes between selected candidate inhibitors and the SARS-CoV
PLpro. Several of the inhibitors discovered this way, e.g. GRL0617 [78],
did not bind directly to the catalytic site of the protease but near the
S3 and S4 sites (these are more spacious than the restricted S1 and S2
sites, which can accommodate exclusively glycine residues of the
substrates, i.e. viral polyprotein or ubiquitin). Figure 4 shows the
inhibitor GRL0617 (space-filling presentation) bound to the S3 and S4
sites, far from the catalytic triad (cyan sticks).

Fig. 4. Structure of the SARS-CoV PLpro in complex with the non-peptidic inhibitor GRL0617 [78]. The domains of the enzyme are indicated
and colored as follows: yellow, ubiquitin-like (Ubl) domain; pink, thumb domain; blue, palm domain; cyan, fingers domain. The zinc ion bound
at the tip of the fingers domain is colored red. The inhibitor (space-filling presentation, with carbon in grey, nitrogen in blue and oxygen in
red) binds to the S3 and S4 sites, far from the catalytic triad, C112-H273-D287 (cyan sticks).

Crystallographic and inhibitor
discovery studies with bat
coronavirus Mpro: increasing the
preparedness against zoonotic
transmission

As evidence was growing for a zoonotic transmission of SARS-CoV from bats
via intermediate hosts to humans [17,18], we became interested in bat
coronavirus main proteases as drug targets. Obviously, the goal is not to
cure bats of their coronavirus infections (being the reservoir, most bats
do not show any sign of disease when they carry coronaviruses), but we
want to design inhibitors for these enzymes to have them ready in case of
a zoonotic transmission of a bat coronavirus into the human population.
The idea is to design and synthesize one or more lead compound(s) with
broad-spectrum anti-coronaviral activity, which can immediately enter
preclinical development in the case of a major epidemic. At the outset of
this project, we selected three bat coronaviruses as representatives of
coronavirus families: Bt-CoV HKU8 as an alphacoronavirus [82], Bt-CoV
HKU4 as a betacoronavirus of clade c [83] and Bt-CoV HKU9 as a
betacoronavirus of clade d [83,84]. (We excluded Betacoronavirus clades a
and b as no bat coronaviruses of the former are known and clade b is
already represented by the well-studied SARS-CoV.) So far, we have
determined the crystal structures of the Mpros of Bt-CoV HKU8 (Ma et al.,
unpublished) and HKU4 (Xiao et al., unpublished; PDB codes 2YNA, 2YNB)
and have noticed that our above-mentioned broad-spectrum antiviral SG85,
a Michael acceptor compound [59], inhibited the HKU4 (but not the HKU8)
enzyme. Proof-of-principle for our ‘predictive’ approach came when
MERS-CoV emerged in 2012 and we found that SG85 was indeed a good
inhibitor of this virus in cell culture (Xiao, de Wilde, Muth et al., to
be published). Bt-CoV HKU4 turned out to be a close relative of MERS-CoV,
with 81% amino acid sequence identity (90% similarity) for the main
protease.
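
Identity and similarity percentages such as those quoted above are derived from a pairwise sequence alignment. The Python sketch below shows one simple way of counting them. It is a toy illustration: the two aligned sequences are invented for this example and are not the HKU4 or MERS-CoV enzymes, and both the residue groups treated as "similar" and the choice of ungapped columns as the denominator are assumptions, since conventions differ between alignment programs.

```python
from typing import Tuple

# Assumption: a simple grouping of chemically related residues.
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG")


def _similar(a: str, b: str) -> bool:
    """True if both residues fall into one of the assumed groups."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aln_a: str, aln_b: str) -> Tuple[float, float]:
    """Percent identity and similarity over ungapped alignment columns."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in columns)
    similar = sum(a == b or _similar(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


# Invented toy alignment (hypothetical sequences).
identity, similarity = identity_and_similarity("SGLVKMSHPS", "SGIVRMAHPT")
print(identity, similarity)  # 60.0 90.0
```

Similarity is always at least as high as identity, because every identical column also counts as similar.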

The inactivity of SG85 against the HKU8 Mpro, however, also suggests that
our inhibitors have to become more broad spectrum than they are at
present. Ideally, one would like to have one broad-spectrum antiviral at
hand that would be efficacious against all coronavirus families.
Modifications of SG85 with good activity against alphacoronaviruses are
now under development in our laboratory.

Structure-based inhibitor discovery 
against MERS coronavirus 

Just as for SARS-CoV, the main protease (Mpro or 3CLpro) and the
papain-like protease (PLpro) are prime targets for the development of
antivirals against the newly emerging MERS-CoV. A three-dimensional
structure was described for the Mpro shortly after the discovery of the
new virus [85], but unfortunately atomic coordinates have not been
deposited in the Protein Data Bank. The same publication describes the
SARS-CoV Mpro inhibitor N3, a Michael acceptor compound [33], as a good
inhibitor of the MERS-CoV Mpro [85]. The structure of the papain-like
protease of the new virus has also been determined [86]. The enzyme
features significant differences from the SARS-CoV PLpro. For example,
the stabilization of the oxyanion intermediate of the proteolytic
reaction catalyzed by the MERS-CoV PLpro appears to be different from the
mechanism proposed for the SARS-CoV PLpro [75]. In papain-like proteases,
the oxyanion is commonly stabilized by two hydrogen bonds from the
enzyme, one donated by the main-chain amide of the catalytic residue,
here Cys111, and the other from a glutamine or asparagine side-chain five
or six residues N-terminal to the catalytic cysteine. In SARS-CoV PLpro,
the corresponding side-chain is that of Trp107, which is proposed to
donate a hydrogen bond to the oxyanion from the indole nitrogen [75]. But
in the MERS-CoV PLpro this tryptophan is replaced by Leu106, which lacks
hydrogen-bonding capability. Interestingly, the Leu106Trp mutation of the
MERS-CoV PLpro increases the peptidolytic and deubiquitinating activities
of the enzyme by factors of 60 and 3.4, respectively [86], indicating
that the protease has not been optimized for maximum activity during
evolution of the virus. Other differences between the SARS-CoV PLpro and
the MERS-CoV PLpro include the S3 and S5 specificity subsites. These
subsites accommodate arginine residues of ubiquitin in the SARS-CoV
PLpro-ubiquitin complex [76,77] and arginine or lysine at the PLpro
cleavage sites in the viral polyprotein. Accordingly, the subsites are
dominated by negatively charged amino acid side-chains in the SARS-CoV
enzyme, i.e. Asp165 in the S3 site and Glu168 in the S5 site. However, in
the MERS-CoV PLpro, the latter residue is replaced by the positively
charged Arg168. Hence, direct extrapolation from the structure of the
SARS-CoV PLpro-ubiquitin complex [76,77] to ubiquitin recognition by the
MERS-CoV enzyme is not possible; rather, this must await the crystal
structure of the complex.

Concluding remarks 

The response of the crystallographic community to the SARS outbreak has
occasionally been described as ‘swift’; however, to be realistic, it
should be noted that had we not determined the structures of the TGEV and
HCoV-229E Mpro, including that of an inhibitor complex, prior to the SARS
outbreak, the response would probably have been significantly slower.
Nevertheless, I hope that I was able in this review to illustrate the
important role played by X-ray crystallography in elucidating the
three-dimensional structures of two important targets for the discovery
and development of anti-coronavirus drugs, the main protease (Mpro) and
the papain-like protease (PLpro). In fact, most of the peptidomimetic
inhibitors of the Mpro were designed on the basis of the structural
knowledge of the enzyme, whereas several non-peptidic inhibitors were
identified by using the crystal structure of the target for virtual
screening of chemical libraries. The known inhibitors of the PLpro, on
the other hand, are mostly based on original hits identified in
high-throughput screening or virtual screening campaigns against the
recombinant enzyme, which were subsequently optimized according to their
docking to the SARS-CoV PLpro or to the crystal structure of their
complex with the target. However, none of the compounds directed against
the coronavirus proteases has gone through a complete preclinical
development program, mainly because of a sharp decline in funding in most
countries in 2005-2006. Nonetheless, some of the inhibitors described so
far are good starting points for development in the case of future
zoonotic transmissions of coronaviruses into the human population, or in
the case of a continuation of the MERS outbreak.

It is occasionally argued that drug discovery should remain a domain of
the pharmaceutical industry and not a priority in academia, as the former
is undoubtedly better at it. However, it should be realized that big
pharma generally has little interest in emerging RNA viruses, because
these typically cause self-limiting rather than chronic disease. Yet,
these viruses potentially pose a big threat to man, as we were
impressively taught by the SARS coronavirus [1], and we are well advised
to increase our preparedness in view of the increasing frequency of
outbreaks caused by these viruses [87,88]. Academic institutions have
important tasks in these efforts at increasing preparedness, as far as
the preclinical discovery phase of the drug development process is
concerned [89]. Macromolecular crystallography will undoubtedly continue
to play a major role in these efforts.